Overview

Dataset statistics

Number of variables: 14
Number of observations: 8693
Missing cells: 2324
Missing cells (%): 1.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 891.5 KiB
Average record size in memory: 105.0 B
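
The two derived figures in this block follow from the raw counts: the missing-cell percentage is missing cells divided by rows times columns, and the average record size is total memory divided by the number of rows. A minimal sketch using only the numbers reported above (variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
n_variables = 14
n_observations = 8693
missing_cells = 2324
total_kib = 891.5

total_cells = n_variables * n_observations           # every row/column slot
missing_pct = 100 * missing_cells / total_cells      # share of empty cells
avg_record_bytes = total_kib * 1024 / n_observations  # KiB -> bytes, per row

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"average record size: {avg_record_bytes:.1f} B")
```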

Variable types

Text: 3
Categorical: 2
Boolean: 3
Numeric: 6

Alerts

VIP is highly imbalanced (84.0%) | Imbalance
HomePlanet has 201 (2.3%) missing values | Missing
CryoSleep has 217 (2.5%) missing values | Missing
Cabin has 199 (2.3%) missing values | Missing
Destination has 182 (2.1%) missing values | Missing
Age has 179 (2.1%) missing values | Missing
VIP has 203 (2.3%) missing values | Missing
RoomService has 181 (2.1%) missing values | Missing
FoodCourt has 183 (2.1%) missing values | Missing
ShoppingMall has 208 (2.4%) missing values | Missing
Spa has 183 (2.1%) missing values | Missing
VRDeck has 188 (2.2%) missing values | Missing
Name has 200 (2.3%) missing values | Missing
PassengerId has unique values | Unique
Age has 178 (2.0%) zeros | Zeros
RoomService has 5577 (64.2%) zeros | Zeros
FoodCourt has 5456 (62.8%) zeros | Zeros
ShoppingMall has 5587 (64.3%) zeros | Zeros
Spa has 5324 (61.2%) zeros | Zeros
VRDeck has 5495 (63.2%) zeros | Zeros
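
The 84.0% imbalance figure for VIP is consistent with one minus the normalized Shannon entropy of the class counts (False 8291, True 199, missing values excluded). That formula is an assumption about how the score was derived, not something the report states. The sketch below only checks that it reproduces the number; the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k), where H is the Shannon entropy (in bits) of the
    class distribution and k is the number of classes. 0 means perfectly
    balanced; values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

vip_score = imbalance_score([8291, 199])            # VIP: False, True
transported_score = imbalance_score([4378, 4315])   # Transported: True, False
print(f"VIP: {vip_score:.1%}, Transported: {transported_score:.1%}")
```

Under the same formula the near-even Transported split scores close to zero, which would explain why only VIP is flagged.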

Reproduction

Analysis started: 2023-12-11 11:07:34.946827
Analysis finished: 2023-12-11 11:07:38.372970
Duration: 3.43 seconds
Software version: ydata-profiling v4.6.3
Download configuration: config.json

Variables

PassengerId
Text

UNIQUE 

Distinct: 8693
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 68.0 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 60851
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
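
As a rough illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for one sample value of this variable; the helper name `category_counts` is illustrative, and the report's own implementation may differ.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code,
    e.g. 'Nd' (Decimal Number) or 'Pc' (Connector Punctuation)."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "0001_01"  # first sample row of PassengerId
counts = category_counts(sample)
print(dict(counts))
```

Six digits and one underscore give the two categories reported for PassengerId, Decimal Number and Connector Punctuation.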

Unique

Unique: 8693
Unique (%): 100.0%

Sample

1st row: 0001_01
2nd row: 0002_01
3rd row: 0003_01
4th row: 0003_02
5th row: 0004_01
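
Every sample value has the same shape: four digits, an underscore, two digits. Rows 3 and 4 share the prefix `0003`. A minimal sketch for splitting the two parts is shown below; treating the prefix as a grouping key is an assumption suggested by the samples, not something the report states, and the function name is hypothetical.

```python
def split_passenger_id(passenger_id):
    """Split an ID such as '0003_02' into its two underscore-separated
    parts, returned as integers (prefix, suffix)."""
    prefix, suffix = passenger_id.split("_")
    return int(prefix), int(suffix)

samples = ["0001_01", "0002_01", "0003_01", "0003_02", "0004_01"]
parsed = [split_passenger_id(pid) for pid in samples]
print(parsed)
```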
Value | Count | Frequency (%)
0024_01 | 1 | < 0.1%
9280_02 | 1 | < 0.1%
0001_01 | 1 | < 0.1%
0002_01 | 1 | < 0.1%
0003_01 | 1 | < 0.1%
0003_02 | 1 | < 0.1%
0004_01 | 1 | < 0.1%
0005_01 | 1 | < 0.1%
0006_01 | 1 | < 0.1%
0006_02 | 1 | < 0.1%
Other values (8683) | 8683 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 12459 | 20.5%
1 | 9827 | 16.1%
_ | 8693 | 14.3%
2 | 5017 | 8.2%
3 | 4039 | 6.6%
4 | 3790 | 6.2%
6 | 3664 | 6.0%
5 | 3606 | 5.9%
8 | 3557 | 5.8%
7 | 3410 | 5.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 52158 | 85.7%
Connector Punctuation | 8693 | 14.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 12459 | 23.9%
1 | 9827 | 18.8%
2 | 5017 | 9.6%
3 | 4039 | 7.7%
4 | 3790 | 7.3%
6 | 3664 | 7.0%
5 | 3606 | 6.9%
8 | 3557 | 6.8%
7 | 3410 | 6.5%
9 | 2789 | 5.3%

Connector Punctuation
Value | Count | Frequency (%)
_ | 8693 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 60851 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 12459 | 20.5%
1 | 9827 | 16.1%
_ | 8693 | 14.3%
2 | 5017 | 8.2%
3 | 4039 | 6.6%
4 | 3790 | 6.2%
6 | 3664 | 6.0%
5 | 3606 | 5.9%
8 | 3557 | 5.8%
7 | 3410 | 5.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 60851 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 12459 | 20.5%
1 | 9827 | 16.1%
_ | 8693 | 14.3%
2 | 5017 | 8.2%
3 | 4039 | 6.6%
4 | 3790 | 6.2%
6 | 3664 | 6.0%
5 | 3606 | 5.9%
8 | 3557 | 5.8%
7 | 3410 | 5.6%

HomePlanet
Categorical

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 201
Missing (%): 2.3%
Memory size: 68.0 KiB

Earth: 4602
Europa: 2131
Mars: 1759

Length

Max length: 6
Median length: 5
Mean length: 5.0438059
Min length: 4

Characters and Unicode

Total characters: 42832
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Europa
2nd row: Earth
3rd row: Europa
4th row: Europa
5th row: Earth

Common Values

Value | Count | Frequency (%)
Earth | 4602 | 52.9%
Europa | 2131 | 24.5%
Mars | 1759 | 20.2%
(Missing) | 201 | 2.3%

Length

[Figure] Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
earth | 4602 | 54.2%
europa | 2131 | 25.1%
mars | 1759 | 20.7%

Most occurring characters

Value | Count | Frequency (%)
a | 8492 | 19.8%
r | 8492 | 19.8%
E | 6733 | 15.7%
t | 4602 | 10.7%
h | 4602 | 10.7%
u | 2131 | 5.0%
o | 2131 | 5.0%
p | 2131 | 5.0%
M | 1759 | 4.1%
s | 1759 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 34340 | 80.2%
Uppercase Letter | 8492 | 19.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 8492 | 24.7%
r | 8492 | 24.7%
t | 4602 | 13.4%
h | 4602 | 13.4%
u | 2131 | 6.2%
o | 2131 | 6.2%
p | 2131 | 6.2%
s | 1759 | 5.1%

Uppercase Letter
Value | Count | Frequency (%)
E | 6733 | 79.3%
M | 1759 | 20.7%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 42832 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 8492 | 19.8%
r | 8492 | 19.8%
E | 6733 | 15.7%
t | 4602 | 10.7%
h | 4602 | 10.7%
u | 2131 | 5.0%
o | 2131 | 5.0%
p | 2131 | 5.0%
M | 1759 | 4.1%
s | 1759 | 4.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 42832 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 8492 | 19.8%
r | 8492 | 19.8%
E | 6733 | 15.7%
t | 4602 | 10.7%
h | 4602 | 10.7%
u | 2131 | 5.0%
o | 2131 | 5.0%
p | 2131 | 5.0%
M | 1759 | 4.1%
s | 1759 | 4.1%

CryoSleep
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 217
Missing (%): 2.5%
Memory size: 68.0 KiB

False: 5439
True: 3037
(Missing): 217

Value | Count | Frequency (%)
False | 5439 | 62.6%
True | 3037 | 34.9%
(Missing) | 217 | 2.5%

Cabin
Text

MISSING 

Distinct: 6560
Distinct (%): 77.2%
Missing: 199
Missing (%): 2.3%
Memory size: 68.0 KiB

Length

Max length: 8
Median length: 7
Mean length: 7.0775842
Min length: 5

Characters and Unicode

Total characters: 60117
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5427
Unique (%): 63.9%

Sample

1st row: B/0/P
2nd row: F/0/S
3rd row: A/0/S
4th row: A/0/S
5th row: F/1/S
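
The sample values all follow a letter/number/letter pattern separated by slashes. The character totals reported for this variable agree: 16988 slashes and 16988 uppercase letters across 8494 non-missing values is exactly two of each per value. A minimal parsing sketch under that assumption follows; the field names `deck`, `num` and `side` are hypothetical labels chosen for illustration, not names used by the report.

```python
from typing import NamedTuple, Optional

class Cabin(NamedTuple):
    deck: str   # hypothetical name for the leading letter
    num: int    # hypothetical name for the middle number
    side: str   # hypothetical name for the trailing letter

def parse_cabin(value: Optional[str]) -> Optional[Cabin]:
    """Parse 'F/1/S' into Cabin('F', 1, 'S'); pass missing values through."""
    if value is None:
        return None
    deck, num, side = value.split("/")
    return Cabin(deck, int(num), side)

samples = ["B/0/P", "F/0/S", "A/0/S", "A/0/S", "F/1/S", None]
parsed = [parse_cabin(v) for v in samples]
print(parsed[4])
```

Splitting the column this way turns one high-cardinality text field (6560 distinct values) into three lower-cardinality parts, which is often easier to analyse.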
Value | Count | Frequency (%)
g/734/s | 8 | 0.1%
c/137/s | 7 | 0.1%
g/1476/s | 7 | 0.1%
b/11/s | 7 | 0.1%
f/1194/p | 7 | 0.1%
b/82/s | 7 | 0.1%
d/176/s | 7 | 0.1%
g/981/s | 7 | 0.1%
e/13/s | 7 | 0.1%
f/1411/p | 7 | 0.1%
Other values (6550) | 8423 | 99.2%

Most occurring characters

Value | Count | Frequency (%)
/ | 16988 | 28.3%
1 | 5326 | 8.9%
S | 4288 | 7.1%
P | 4206 | 7.0%
2 | 3078 | 5.1%
F | 2794 | 4.6%
3 | 2601 | 4.3%
G | 2559 | 4.3%
4 | 2393 | 4.0%
5 | 2377 | 4.0%
Other values (11) | 13507 | 22.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 26141 | 43.5%
Other Punctuation | 16988 | 28.3%
Uppercase Letter | 16988 | 28.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 5326 | 20.4%
2 | 3078 | 11.8%
3 | 2601 | 9.9%
4 | 2393 | 9.2%
5 | 2377 | 9.1%
6 | 2176 | 8.3%
7 | 2166 | 8.3%
8 | 2093 | 8.0%
9 | 1982 | 7.6%
0 | 1949 | 7.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 4288 | 25.2%
P | 4206 | 24.8%
F | 2794 | 16.4%
G | 2559 | 15.1%
E | 876 | 5.2%
B | 779 | 4.6%
C | 747 | 4.4%
D | 478 | 2.8%
A | 256 | 1.5%
T | 5 | < 0.1%

Other Punctuation
Value | Count | Frequency (%)
/ | 16988 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 43129 | 71.7%
Latin | 16988 | 28.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
/ | 16988 | 39.4%
1 | 5326 | 12.3%
2 | 3078 | 7.1%
3 | 2601 | 6.0%
4 | 2393 | 5.5%
5 | 2377 | 5.5%
6 | 2176 | 5.0%
7 | 2166 | 5.0%
8 | 2093 | 4.9%
9 | 1982 | 4.6%

Latin
Value | Count | Frequency (%)
S | 4288 | 25.2%
P | 4206 | 24.8%
F | 2794 | 16.4%
G | 2559 | 15.1%
E | 876 | 5.2%
B | 779 | 4.6%
C | 747 | 4.4%
D | 478 | 2.8%
A | 256 | 1.5%
T | 5 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 60117 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 16988 | 28.3%
1 | 5326 | 8.9%
S | 4288 | 7.1%
P | 4206 | 7.0%
2 | 3078 | 5.1%
F | 2794 | 4.6%
3 | 2601 | 4.3%
G | 2559 | 4.3%
4 | 2393 | 4.0%
5 | 2377 | 4.0%
Other values (11) | 13507 | 22.5%

Destination
Categorical

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 182
Missing (%): 2.1%
Memory size: 68.0 KiB

TRAPPIST-1e: 5915
55 Cancri e: 1800
PSO J318.5-22: 796

Length

Max length: 13
Median length: 11
Mean length: 11.187052
Min length: 11

Characters and Unicode

Total characters: 95213
Distinct characters: 23
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TRAPPIST-1e
2nd row: TRAPPIST-1e
3rd row: TRAPPIST-1e
4th row: TRAPPIST-1e
5th row: TRAPPIST-1e

Common Values

Value | Count | Frequency (%)
TRAPPIST-1e | 5915 | 68.0%
55 Cancri e | 1800 | 20.7%
PSO J318.5-22 | 796 | 9.2%
(Missing) | 182 | 2.1%

Length

[Figure] Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
trappist-1e | 5915 | 45.8%
55 | 1800 | 13.9%
cancri | 1800 | 13.9%
e | 1800 | 13.9%
pso | 796 | 6.2%
j318.5-22 | 796 | 6.2%
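
The percentages in this table differ from the Common Values table because the counts appear to be per token rather than per value: each destination is lowercased and split on whitespace, so "55 Cancri e" contributes three tokens. This reading is inferred from the numbers, not from documented behaviour. The sketch below shows that it reproduces the reported counts and shares.

```python
from collections import Counter

# Value counts copied from the Common Values table for Destination.
value_counts = {"TRAPPIST-1e": 5915, "55 Cancri e": 1800, "PSO J318.5-22": 796}

token_counts = Counter()
for value, count in value_counts.items():
    for token in value.lower().split():   # lowercase, then split on whitespace
        token_counts[token] += count

total_tokens = sum(token_counts.values())
shares = {tok: round(100 * n / total_tokens, 1) for tok, n in token_counts.items()}

for tok, n in token_counts.most_common():
    print(f"{tok}: {n} ({shares[tok]}%)")
```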

Most occurring characters

Value | Count | Frequency (%)
P | 12626 | 13.3%
T | 11830 | 12.4%
e | 7715 | 8.1%
- | 6711 | 7.0%
S | 6711 | 7.0%
1 | 6711 | 7.0%
R | 5915 | 6.2%
I | 5915 | 6.2%
A | 5915 | 6.2%
5 | 4396 | 4.6%
Other values (13) | 20768 | 21.8%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 52304 | 54.9%
Lowercase Letter | 16715 | 17.6%
Decimal Number | 14291 | 15.0%
Dash Punctuation | 6711 | 7.0%
Space Separator | 4396 | 4.6%
Other Punctuation | 796 | 0.8%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
P | 12626 | 24.1%
T | 11830 | 22.6%
S | 6711 | 12.8%
R | 5915 | 11.3%
I | 5915 | 11.3%
A | 5915 | 11.3%
C | 1800 | 3.4%
O | 796 | 1.5%
J | 796 | 1.5%

Lowercase Letter
Value | Count | Frequency (%)
e | 7715 | 46.2%
a | 1800 | 10.8%
n | 1800 | 10.8%
c | 1800 | 10.8%
r | 1800 | 10.8%
i | 1800 | 10.8%

Decimal Number
Value | Count | Frequency (%)
1 | 6711 | 47.0%
5 | 4396 | 30.8%
2 | 1592 | 11.1%
3 | 796 | 5.6%
8 | 796 | 5.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 6711 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 4396 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 796 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 69019 | 72.5%
Common | 26194 | 27.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
P | 12626 | 18.3%
T | 11830 | 17.1%
e | 7715 | 11.2%
S | 6711 | 9.7%
R | 5915 | 8.6%
I | 5915 | 8.6%
A | 5915 | 8.6%
C | 1800 | 2.6%
a | 1800 | 2.6%
n | 1800 | 2.6%
Other values (5) | 6992 | 10.1%

Common
Value | Count | Frequency (%)
- | 6711 | 25.6%
1 | 6711 | 25.6%
5 | 4396 | 16.8%
(space) | 4396 | 16.8%
2 | 1592 | 6.1%
3 | 796 | 3.0%
8 | 796 | 3.0%
. | 796 | 3.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 95213 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
P | 12626 | 13.3%
T | 11830 | 12.4%
e | 7715 | 8.1%
- | 6711 | 7.0%
S | 6711 | 7.0%
1 | 6711 | 7.0%
R | 5915 | 6.2%
I | 5915 | 6.2%
A | 5915 | 6.2%
5 | 4396 | 4.6%
Other values (13) | 20768 | 21.8%

Age
Real number (ℝ)

MISSING  ZEROS 

Distinct: 80
Distinct (%): 0.9%
Missing: 179
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.82793
Minimum: 0
Maximum: 79
Zeros: 178
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 68.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 19
median: 27
Q3: 38
95-th percentile: 56
Maximum: 79
Range: 79
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 14.489021
Coefficient of variation (CV): 0.50260359
Kurtosis: 0.10193292
Mean: 28.82793
Median Absolute Deviation (MAD): 9
Skewness: 0.41909658
Sum: 245441
Variance: 209.93174
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=50)
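
Several of the derived statistics for Age can be recomputed from the primary ones, which is a quick way to sanity-check a profile. The sketch below uses only figures reported for this variable (variable names are illustrative). It assumes the mean is taken over non-missing rows, which the numbers bear out.

```python
# Figures copied from the Age statistics in this report.
n_rows, n_missing = 8693, 179
total, mean, std = 245441, 28.82793, 14.489021
q1, q3, minimum, maximum = 19, 38, 0, 79

n_valid = n_rows - n_missing      # rows that actually carry a value
mean_check = total / n_valid      # Sum / non-missing count
cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum

print(f"mean={mean_check:.5f} cv={cv:.4f} var={variance:.3f} "
      f"iqr={iqr} range={value_range}")
```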
Common values

Value | Count | Frequency (%)
24 | 324 | 3.7%
18 | 320 | 3.7%
21 | 311 | 3.6%
19 | 293 | 3.4%
23 | 292 | 3.4%
22 | 291 | 3.3%
20 | 277 | 3.2%
26 | 268 | 3.1%
28 | 267 | 3.1%
27 | 259 | 3.0%
Other values (70) | 5612 | 64.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 178 | 2.0%
1 | 67 | 0.8%
2 | 75 | 0.9%
3 | 75 | 0.9%
4 | 71 | 0.8%
5 | 33 | 0.4%
6 | 40 | 0.5%
7 | 52 | 0.6%
8 | 46 | 0.5%
9 | 42 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
79 | 3 | < 0.1%
78 | 3 | < 0.1%
77 | 2 | < 0.1%
76 | 2 | < 0.1%
75 | 4 | < 0.1%
74 | 5 | 0.1%
73 | 7 | 0.1%
72 | 4 | < 0.1%
71 | 7 | 0.1%
70 | 9 | 0.1%

VIP
Boolean

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 203
Missing (%): 2.3%
Memory size: 68.0 KiB

False: 8291
True: 199
(Missing): 203

Value | Count | Frequency (%)
False | 8291 | 95.4%
True | 199 | 2.3%
(Missing) | 203 | 2.3%

RoomService
Real number (ℝ)

MISSING  ZEROS 

Distinct: 1273
Distinct (%): 15.0%
Missing: 181
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 224.68762
Minimum: 0
Maximum: 14327
Zeros: 5577
Zeros (%): 64.2%
Negative: 0
Negative (%): 0.0%
Memory size: 68.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 47
95-th percentile: 1274.25
Maximum: 14327
Range: 14327
Interquartile range (IQR): 47

Descriptive statistics

Standard deviation: 666.71766
Coefficient of variation (CV): 2.9673093
Kurtosis: 65.273802
Mean: 224.68762
Median Absolute Deviation (MAD): 0
Skewness: 6.3330141
Sum: 1912541
Variance: 444512.44
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=50)
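
The zero and missing percentages reported for RoomService use a different denominator than the mean: the percentages are taken over all 8693 rows, while the mean divides the sum by the non-missing rows only. This reading is inferred from the reported figures. The short check below reproduces them (names are illustrative) and also shows how far the mean sits from the typical non-zero row in such a zero-heavy column.

```python
# Figures copied from the RoomService statistics in this report.
n_rows = 8693
n_missing, n_zeros = 181, 5577
total = 1912541

zeros_pct = 100 * n_zeros / n_rows       # share of all rows equal to 0
missing_pct = 100 * n_missing / n_rows   # share of all rows with no value
n_valid = n_rows - n_missing
mean = total / n_valid                   # mean over non-missing rows only
n_nonzero = n_valid - n_zeros
nonzero_mean = total / n_nonzero         # mean among rows with a value above 0

print(f"zeros={zeros_pct:.1f}% missing={missing_pct:.1f}% mean={mean:.5f}")
print(f"{n_nonzero} non-zero rows, mean among them={nonzero_mean:.2f}")
```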
Common values

Value | Count | Frequency (%)
0 | 5577 | 64.2%
1 | 117 | 1.3%
2 | 79 | 0.9%
3 | 61 | 0.7%
4 | 47 | 0.5%
5 | 28 | 0.3%
9 | 25 | 0.3%
8 | 24 | 0.3%
6 | 24 | 0.3%
14 | 21 | 0.2%
Other values (1263) | 2509 | 28.9%
(Missing) | 181 | 2.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 5577 | 64.2%
1 | 117 | 1.3%
2 | 79 | 0.9%
3 | 61 | 0.7%
4 | 47 | 0.5%
5 | 28 | 0.3%
6 | 24 | 0.3%
7 | 17 | 0.2%
8 | 24 | 0.3%
9 | 25 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
14327 | 1 | < 0.1%
9920 | 1 | < 0.1%
8586 | 1 | < 0.1%
8243 | 1 | < 0.1%
8209 | 1 | < 0.1%
8168 | 1 | < 0.1%
8151 | 1 | < 0.1%
8142 | 1 | < 0.1%
8030 | 1 | < 0.1%
7406 | 1 | < 0.1%

FoodCourt
Real number (ℝ)

MISSING  ZEROS 

Distinct: 1507
Distinct (%): 17.7%
Missing: 183
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 458.0772
Minimum: 0
Maximum: 29813
Zeros: 5456
Zeros (%): 62.8%
Negative: 0
Negative (%): 0.0%
Memory size: 68.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 76
95-th percentile: 2748.5
Maximum: 29813
Range: 29813
Interquartile range (IQR): 76

Descriptive statistics

Standard deviation: 1611.4892
Coefficient of variation (CV): 3.5179425
Kurtosis: 73.30723
Mean: 458.0772
Median Absolute Deviation (MAD): 0
Skewness: 7.1022279
Sum: 3898237
Variance: 2596897.6
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 5456 | 62.8%
1 | 116 | 1.3%
2 | 75 | 0.9%
4 | 53 | 0.6%
3 | 53 | 0.6%
5 | 33 | 0.4%
6 | 31 | 0.4%
9 | 28 | 0.3%
10 | 27 | 0.3%
7 | 27 | 0.3%
Other values (1497) | 2611 | 30.0%
(Missing) | 183 | 2.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 5456 | 62.8%
1 | 116 | 1.3%
2 | 75 | 0.9%
3 | 53 | 0.6%
4 | 53 | 0.6%
5 | 33 | 0.4%
6 | 31 | 0.4%
7 | 27 | 0.3%
8 | 20 | 0.2%
9 | 28 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
29813 | 1 | < 0.1%
27723 | 1 | < 0.1%
27071 | 1 | < 0.1%
26830 | 1 | < 0.1%
21066 | 1 | < 0.1%
18481 | 1 | < 0.1%
17958 | 1 | < 0.1%
17901 | 1 | < 0.1%
17687 | 1 | < 0.1%
17432 | 1 | < 0.1%

ShoppingMall
Real number (ℝ)

MISSING  ZEROS 

Distinct: 1115
Distinct (%): 13.1%
Missing: 208
Missing (%): 2.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 173.72917
Minimum: 0
Maximum: 23492
Zeros: 5587
Zeros (%): 64.3%
Negative: 0
Negative (%): 0.0%
Memory size: 68.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 27
95-th percentile: 927.8
Maximum: 23492
Range: 23492
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 604.69646
Coefficient of variation (CV): 3.4806847
Kurtosis: 328.87091
Mean: 173.72917
Median Absolute Deviation (MAD): 0
Skewness: 12.627562
Sum: 1474092
Variance: 365657.81
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 5587 | 64.3%
1 | 153 | 1.8%
2 | 80 | 0.9%
3 | 59 | 0.7%
4 | 45 | 0.5%
5 | 38 | 0.4%
7 | 36 | 0.4%
6 | 34 | 0.4%
13 | 29 | 0.3%
9 | 28 | 0.3%
Other values (1105) | 2396 | 27.6%
(Missing) | 208 | 2.4%

Minimum 10 values

Value | Count | Frequency (%)
0 | 5587 | 64.3%
1 | 153 | 1.8%
2 | 80 | 0.9%
3 | 59 | 0.7%
4 | 45 | 0.5%
5 | 38 | 0.4%
6 | 34 | 0.4%
7 | 36 | 0.4%
8 | 28 | 0.3%
9 | 28 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
23492 | 1 | < 0.1%
12253 | 1 | < 0.1%
10705 | 1 | < 0.1%
10424 | 1 | < 0.1%
9058 | 1 | < 0.1%
7810 | 1 | < 0.1%
7185 | 1 | < 0.1%
7148 | 1 | < 0.1%
7104 | 1 | < 0.1%
6805 | 1 | < 0.1%

Spa
Real number (ℝ)

MISSING  ZEROS 

Distinct: 1327
Distinct (%): 15.6%
Missing: 183
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 311.13878
Minimum: 0
Maximum: 22408
Zeros: 5324
Zeros (%): 61.2%
Negative: 0
Negative (%): 0.0%
Memory size: 68.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 59
95-th percentile: 1607.1
Maximum: 22408
Range: 22408
Interquartile range (IQR): 59

Descriptive statistics

Standard deviation: 1136.7055
Coefficient of variation (CV): 3.6533715
Kurtosis: 81.20211
Mean: 311.13878
Median Absolute Deviation (MAD): 0
Skewness: 7.6360199
Sum: 2647791
Variance: 1292099.5
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 5324 | 61.2%
1 | 146 | 1.7%
2 | 105 | 1.2%
3 | 53 | 0.6%
5 | 53 | 0.6%
4 | 46 | 0.5%
7 | 34 | 0.4%
6 | 33 | 0.4%
8 | 28 | 0.3%
9 | 28 | 0.3%
Other values (1317) | 2660 | 30.6%
(Missing) | 183 | 2.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 5324 | 61.2%
1 | 146 | 1.7%
2 | 105 | 1.2%
3 | 53 | 0.6%
4 | 46 | 0.5%
5 | 53 | 0.6%
6 | 33 | 0.4%
7 | 34 | 0.4%
8 | 28 | 0.3%
9 | 28 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
22408 | 1 | < 0.1%
18572 | 1 | < 0.1%
16594 | 1 | < 0.1%
16139 | 1 | < 0.1%
15586 | 1 | < 0.1%
15331 | 1 | < 0.1%
15238 | 1 | < 0.1%
14970 | 1 | < 0.1%
13995 | 1 | < 0.1%
13902 | 1 | < 0.1%

VRDeck
Real number (ℝ)

MISSING  ZEROS 

Distinct: 1306
Distinct (%): 15.4%
Missing: 188
Missing (%): 2.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 304.85479
Minimum: 0
Maximum: 24133
Zeros: 5495
Zeros (%): 63.2%
Negative: 0
Negative (%): 0.0%
Memory size: 68.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 46
95-th percentile: 1534.2
Maximum: 24133
Range: 24133
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 1145.7172
Coefficient of variation (CV): 3.7582391
Kurtosis: 86.011186
Mean: 304.85479
Median Absolute Deviation (MAD): 0
Skewness: 7.8197316
Sum: 2592790
Variance: 1312667.9
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 5495 | 63.2%
1 | 139 | 1.6%
2 | 70 | 0.8%
3 | 56 | 0.6%
5 | 51 | 0.6%
4 | 47 | 0.5%
6 | 32 | 0.4%
8 | 30 | 0.3%
7 | 29 | 0.3%
9 | 25 | 0.3%
Other values (1296) | 2531 | 29.1%
(Missing) | 188 | 2.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 5495 | 63.2%
1 | 139 | 1.6%
2 | 70 | 0.8%
3 | 56 | 0.6%
4 | 47 | 0.5%
5 | 51 | 0.6%
6 | 32 | 0.4%
7 | 29 | 0.3%
8 | 30 | 0.3%
9 | 25 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
24133 | 1 | < 0.1%
20336 | 1 | < 0.1%
17306 | 1 | < 0.1%
17074 | 1 | < 0.1%
16337 | 1 | < 0.1%
14485 | 1 | < 0.1%
12708 | 1 | < 0.1%
12685 | 1 | < 0.1%
12682 | 1 | < 0.1%
12424 | 1 | < 0.1%

Name
Text

MISSING 

Distinct: 8473
Distinct (%): 99.8%
Missing: 200
Missing (%): 2.3%
Memory size: 68.0 KiB

Length

Max length: 18
Median length: 15
Mean length: 13.833628
Min length: 7

Characters and Unicode

Total characters: 117489
Distinct characters: 53
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8453
Unique (%): 99.5%

Sample

1st row: Maham Ofracculy
2nd row: Juanna Vines
3rd row: Altark Susent
4th row: Solam Susent
5th row: Willy Santantines
Value | Count | Frequency (%)
willy | 20 | 0.1%
casonston | 18 | 0.1%
oneiles | 16 | 0.1%
domington | 15 | 0.1%
litthews | 15 | 0.1%
garnes | 14 | 0.1%
fulloydez | 14 | 0.1%
cartez | 14 | 0.1%
browlerson | 14 | 0.1%
briggston | 13 | 0.1%
Other values (4880) | 16833 | 99.1%

Most occurring characters

Value | Count | Frequency (%)
e | 12691 | 10.8%
a | 10251 | 8.7%
n | 9155 | 7.8%
(space) | 8493 | 7.2%
r | 7707 | 6.6%
o | 6563 | 5.6%
i | 6456 | 5.5%
l | 6231 | 5.3%
s | 5299 | 4.5%
t | 4552 | 3.9%
Other values (43) | 40091 | 34.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 92010 | 78.3%
Uppercase Letter | 16986 | 14.5%
Space Separator | 8493 | 7.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 12691 | 13.8%
a | 10251 | 11.1%
n | 9155 | 10.0%
r | 7707 | 8.4%
o | 6563 | 7.1%
i | 6456 | 7.0%
l | 6231 | 6.8%
s | 5299 | 5.8%
t | 4552 | 4.9%
y | 4093 | 4.4%
Other values (17) | 19012 | 20.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 1530 | 9.0%
C | 1499 | 8.8%
B | 1412 | 8.3%
M | 1261 | 7.4%
A | 1194 | 7.0%
P | 987 | 5.8%
H | 911 | 5.4%
G | 848 | 5.0%
D | 809 | 4.8%
W | 742 | 4.4%
Other values (15) | 5793 | 34.1%

Space Separator
Value | Count | Frequency (%)
(space) | 8493 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 108996 | 92.8%
Common | 8493 | 7.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 12691 | 11.6%
a | 10251 | 9.4%
n | 9155 | 8.4%
r | 7707 | 7.1%
o | 6563 | 6.0%
i | 6456 | 5.9%
l | 6231 | 5.7%
s | 5299 | 4.9%
t | 4552 | 4.2%
y | 4093 | 3.8%
Other values (42) | 35998 | 33.0%

Common
Value | Count | Frequency (%)
(space) | 8493 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 117401 | 99.9%
None | 88 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 12691 | 10.8%
a | 10251 | 8.7%
n | 9155 | 7.8%
(space) | 8493 | 7.2%
r | 7707 | 6.6%
o | 6563 | 5.6%
i | 6456 | 5.5%
l | 6231 | 5.3%
s | 5299 | 4.5%
t | 4552 | 3.9%
Other values (42) | 40003 | 34.1%

None
Value | Count | Frequency (%)
é | 88 | 100.0%

Transported
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

True: 4378
False: 4315

Value | Count | Frequency (%)
True | 4378 | 50.4%
False | 4315 | 49.6%

Interactions

[Figures: 36 interaction plots]

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
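
A per-column nullity count like the one summarized in the first figure can be sketched with the standard library alone. The rows below are a small excerpt adapted from this report's sample (three columns only, with `None` standing in for NaN); the function name `nullity_by_column` is illustrative.

```python
def nullity_by_column(rows):
    """Return {column: number of rows whose value in that column is None}."""
    columns = list(rows[0].keys()) if rows else []
    return {col: sum(1 for row in rows if row[col] is None) for col in columns}

# Excerpt of sample rows from the report; None marks a missing value.
rows = [
    {"PassengerId": "0006_02", "HomePlanet": "Earth", "VRDeck": None},
    {"PassengerId": "9274_01", "HomePlanet": None, "VRDeck": 0.0},
    {"PassengerId": "9275_01", "HomePlanet": "Europa", "VRDeck": 0.0},
]

missing = nullity_by_column(rows)
print(missing)
```

Note that a value of 0.0 counts as present: zeros and missing values are reported separately throughout this profile.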

Sample

(index) | PassengerId | HomePlanet | CryoSleep | Cabin | Destination | Age | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Name | Transported
0 | 0001_01 | Europa | False | B/0/P | TRAPPIST-1e | 39.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Maham Ofracculy | False
1 | 0002_01 | Earth | False | F/0/S | TRAPPIST-1e | 24.0 | False | 109.0 | 9.0 | 25.0 | 549.0 | 44.0 | Juanna Vines | True
2 | 0003_01 | Europa | False | A/0/S | TRAPPIST-1e | 58.0 | True | 43.0 | 3576.0 | 0.0 | 6715.0 | 49.0 | Altark Susent | False
3 | 0003_02 | Europa | False | A/0/S | TRAPPIST-1e | 33.0 | False | 0.0 | 1283.0 | 371.0 | 3329.0 | 193.0 | Solam Susent | False
4 | 0004_01 | Earth | False | F/1/S | TRAPPIST-1e | 16.0 | False | 303.0 | 70.0 | 151.0 | 565.0 | 2.0 | Willy Santantines | True
5 | 0005_01 | Earth | False | F/0/P | PSO J318.5-22 | 44.0 | False | 0.0 | 483.0 | 0.0 | 291.0 | 0.0 | Sandie Hinetthews | True
6 | 0006_01 | Earth | False | F/2/S | TRAPPIST-1e | 26.0 | False | 42.0 | 1539.0 | 3.0 | 0.0 | 0.0 | Billex Jacostaffey | True
7 | 0006_02 | Earth | True | G/0/S | TRAPPIST-1e | 28.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | NaN | Candra Jacostaffey | True
8 | 0007_01 | Earth | False | F/3/S | TRAPPIST-1e | 35.0 | False | 0.0 | 785.0 | 17.0 | 216.0 | 0.0 | Andona Beston | True
9 | 0008_01 | Europa | True | B/1/P | 55 Cancri e | 14.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Erraiam Flatic | True

(index) | PassengerId | HomePlanet | CryoSleep | Cabin | Destination | Age | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Name | Transported
8683 | 9272_02 | Earth | False | F/1894/P | TRAPPIST-1e | 21.0 | False | 86.0 | 3.0 | 149.0 | 208.0 | 329.0 | Gordo Simson | False
8684 | 9274_01 | NaN | True | G/1508/P | TRAPPIST-1e | 23.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Chelsa Bullisey | True
8685 | 9275_01 | Europa | False | A/97/P | TRAPPIST-1e | 0.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Polaton Conable | True
8686 | 9275_02 | Europa | False | A/97/P | TRAPPIST-1e | 32.0 | False | 1.0 | 1146.0 | 0.0 | 50.0 | 34.0 | Diram Conable | False
8687 | 9275_03 | Europa | NaN | A/97/P | TRAPPIST-1e | 30.0 | False | 0.0 | 3208.0 | 0.0 | 2.0 | 330.0 | Atlasym Conable | True
8688 | 9276_01 | Europa | False | A/98/P | 55 Cancri e | 41.0 | True | 0.0 | 6819.0 | 0.0 | 1643.0 | 74.0 | Gravior Noxnuther | False
8689 | 9278_01 | Earth | True | G/1499/S | PSO J318.5-22 | 18.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Kurta Mondalley | False
8690 | 9279_01 | Earth | False | G/1500/S | TRAPPIST-1e | 26.0 | False | 0.0 | 0.0 | 1872.0 | 1.0 | 0.0 | Fayey Connon | True
8691 | 9280_01 | Europa | False | E/608/S | 55 Cancri e | 32.0 | False | 0.0 | 1049.0 | 0.0 | 353.0 | 3235.0 | Celeon Hontichre | False
8692 | 9280_02 | Europa | False | E/608/S | TRAPPIST-1e | 44.0 | False | 126.0 | 4688.0 | 0.0 | 0.0 | 12.0 | Propsh Hontichre | True